KEY SCHEDULER FOR ENCRYPTION APPARATUS
USING DATA ENCRYPTION STANDARD ALGORITHM

Field of the Invention



The present invention relates to a key scheduler for an
encryption apparatus; and, more particularly, to a key
scheduler for an 8-round encryption apparatus using the data
encryption standard algorithm.



Description of the Prior Art 



The DES (Data Encryption Standard) algorithm has come to the
fore in this environment of wider network usage. In
particular, the DES is widely used in Internet security
applications, remote access servers, cable modems and
satellite modems.

The DES is fundamentally a 64-bit block cipher having
64-bit block input and output, 56 bits among the 64-bit key
block for encryption and decryption and the remaining 8 bits
for parity checking. The DES receives a 64-bit plain text
block and outputs a 64-bit cipher text generated from the
64-bit plain text block and the 56-bit key.

In a major technique, the DES is implemented by
permutation (P-Box), substitution (S-Box) and a key schedule
generating a subkey.

Inside, data encryption is implemented by iterating 16 round
operations, preceded by an initial permutation (IP) at the
input part and followed by an inverse initial permutation
(IP^-1) at the output part.

Fig. 1 is a block diagram of a general DES architecture.

Referring to Fig. 1, the general DES architecture
includes an initial permutation unit 110, a DES encryption
unit 120 and an inverse initial permutation unit 130.

In the DES encryption unit 120, a 64-bit plain text block
that has undergone the IP unit is divided into two blocks,
respectively registered at a first left register (L0) and a
first right register (R0). At every round, 32-bit data
registered at the left register and the right register
undergoes a product transformation and a block
transformation. The inverse initial permutation unit 130
performs the inverse initial permutation (IP^-1) of the 64-bit
data transformed by the 16-round operation and outputs a
cipher text block.

The basic operation unit 120 includes a plurality of
cipher function units 121 and exclusive-OR (X-OR) units 122.

32-bit data registered at the first right register (R0)
is encrypted by the cipher function unit f 121 using the
sub-key (K1) from a key scheduler, and the encrypted 32-bit
data is X-ORed with the 32-bit data registered at the first
left register (L0) at the X-OR unit 122. The 32-bit data from
the X-OR unit 122 is registered at a right register (R1), and
the 32-bit data registered at the first right register (R0) is
swapped and registered at a left register (L1) in the next
round, which is referred to as 'one round operation'. In the
DES architecture, 16 round operations are performed by
iteration of the one round operation.

The 16-round operation can be expressed as equations (1) and
(2).

L_i = R_{i-1}, \quad i = 1, 2, \ldots, 16 \qquad (1)

R_i = L_{i-1} \oplus f(R_{i-1}, K_i), \quad i = 1, 2, \ldots, 16 \qquad (2)
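
Equations (1) and (2) can be sketched in a few lines of Python. The round function `toy_f` is a placeholder assumed only for illustration; it is not the DES cipher function f, and the names `feistel_encrypt` and `feistel_decrypt` and the dummy subkeys are likewise hypothetical.

```python
MASK32 = 0xFFFFFFFF

def toy_f(r, k):
    # Placeholder round function (NOT the DES f): any 32-bit mixing is
    # enough to show the structure of equations (1) and (2).
    return ((r * 0x9E3779B1) ^ k ^ (r >> 7)) & MASK32

def feistel_encrypt(l0, r0, subkeys, f=toy_f):
    left, right = l0, r0
    for k in subkeys:
        # (1) L_i = R_{i-1};  (2) R_i = L_{i-1} XOR f(R_{i-1}, K_i)
        left, right = right, left ^ f(right, k)
    return left, right

def feistel_decrypt(l16, r16, subkeys, f=toy_f):
    left, right = l16, r16
    for k in reversed(subkeys):
        # Undo one round: L_{i-1} = R_i XOR f(L_i, K_i);  R_{i-1} = L_i
        left, right = right ^ f(left, k), left
    return left, right

subkeys = [(0x0F0F0F0F * (i + 1)) & MASK32 for i in range(16)]  # dummy subkeys
cipher = feistel_encrypt(0x01234567, 0x89ABCDEF, subkeys)
plain = feistel_decrypt(*cipher, subkeys)
```

Because each round only XORs one half with a function of the other, the rounds can be undone with the subkeys in reverse order regardless of what f is, which is why the same hardware serves encryption and decryption.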



Fig. 2 is a block diagram of a conventional key
scheduler generating a subkey.

Referring to Fig. 2, the conventional key scheduler
includes a first permutation choice (PC1) unit 200, a first
and a second shift units 220 and 230, and a second
permutation choice (PC2) unit 240.



The first permutation choice (PC1) unit 200 performs
permutation of 56-bit key data. The permutated 56-bit key
data is divided into two 28-bit blocks, and the blocks are
registered in registers C0 and D0. Each of the shift units
220 and 230 respectively shifts the corresponding 28 bits
registered in Ci and Di (i = 0, 1, ..., 15). The shifted key
data blocks are registered in the next round registers Ci+1
and Di+1. The second permutation choice (PC2) unit 240
performs permutation of the 28-bit blocks registered in the
registers Ci and Di to output a 48-bit subkey Ki.

During the 16-round operation, the key data blocks of Ci and
Di are shifted by 28 bits in total, such that the data
registered in C0 and D0 are equal to those registered in C16
and D16.
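
The claim that C16 equals C0 can be checked numerically. The sketch below assumes the shifts are circular and uses the per-round amounts of the standard DES key schedule, which this text does not list; `rotl28` and the example register value are hypothetical.

```python
# Per-round left-rotation amounts of the standard DES key schedule
# (taken from the DES standard; this text does not list them).
DES_SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]
MASK28 = (1 << 28) - 1

def rotl28(x, n):
    """Rotate a 28-bit value left by n bits (n taken modulo 28)."""
    n %= 28
    return ((x << n) | (x >> (28 - n))) & MASK28

c = c0 = 0x0ABCDEF        # arbitrary 28-bit example value for register C0
for s in DES_SHIFTS:
    c = rotl28(c, s)      # C_{i+1} is C_i rotated left by the round's shift

total_shift = sum(DES_SHIFTS)
```

Since the amounts add up to the register width, the sixteen rotations compose to the identity.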

Fig. 3 is a detailed diagram of a cipher function unit
and an S-Box permutation unit of a general DES architecture.

Referring to Fig. 3, the cipher function f includes an
expansion permutation unit 310, an exclusive-OR (XOR) unit
320, an S-Box permutation unit 330, a P-Box permutation unit
340 and an XOR unit 350.

The expansion permutation unit 310 performs expansion
permutation over 32-bit data (Ri-1) from a right register
registering a 32-bit text block to output 48-bit data.

The XOR unit 320 performs an XOR operation over the 48-bit
data from the expansion permutation unit 310 and a subkey
(Ki) from a key scheduler.

The S-Box permutation unit 330 performs substitution
over the 48-bit data from the XOR unit 320 to output 32-bit
data.

The P-Box permutation unit 340 performs permutation over
the 32-bit data from the S-Box permutation unit 330.

The XOR unit 350 performs an XOR operation over the 32-bit
data from the P-Box permutation unit 340 and 32-bit data
(Li-1) from a left register.

The key scheduler includes a first permutation choice
(PC1) unit 360, two shift units 370 and 380 and a second
permutation choice (PC2) unit 390. Each of the shift units
370 and 380 respectively shifts the corresponding 28 bits,
half of the 56-bit key data.

The PC2 unit 390 receives two blocks from the shift
units 370 and 380 to compress them to the subkey.

In particular, the S-Box permutation unit 330 includes 8
S-Boxes for receiving 48-bit data and outputting 32-bit data.
That is, the 48-bit data block is divided into eight 6-bit
data, each applied to the corresponding S-Box of the 8
S-Boxes, and each of the 8 S-Boxes outputs 4-bit data.
Accordingly, the 48-bit data is permutated to 32-bit data. The
S-Box permutation unit 330 requires a memory, e.g., a
programmable logic array (PLA) or a read only memory (ROM),
because it employs a table look-up technique. Since each of
the S-Boxes outputs 4 bits for a 6-bit input, it requires
64 x 4 memory capability and the S-Box permutation unit 330
requires 8 x 64 x 4 memory capability. Accordingly, the S-Box
permutation unit 330 takes a relatively large area in a chip.
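
The table look-up and the memory figure can be illustrated as follows. This is a minimal sketch: `TOY_SBOXES` holds placeholder values, not the real DES S-Box contents, and treating each 6-bit group directly as a table address is an assumption made for brevity.

```python
N_SBOXES = 8
IN_BITS, OUT_BITS = 6, 4

# Placeholder tables (NOT the real DES S-Boxes): 8 tables of 64 four-bit entries.
TOY_SBOXES = [[(17 * j + 5 * i) % 16 for i in range(1 << IN_BITS)]
              for j in range(N_SBOXES)]

def sbox_layer(x48, tables=TOY_SBOXES):
    """Map 48 bits to 32 bits: eight 6-bit groups, each looked up to 4 bits."""
    out = 0
    for j in range(N_SBOXES):
        shift = IN_BITS * (N_SBOXES - 1 - j)   # group 0 is the most significant
        group = (x48 >> shift) & 0x3F          # 6-bit table address
        out = (out << OUT_BITS) | tables[j][group]
    return out

bits_per_sbox = (1 << IN_BITS) * OUT_BITS      # 64 x 4
total_bits = N_SBOXES * bits_per_sbox          # 8 x 64 x 4
y32 = sbox_layer(0x123456789ABC)
```

One S-Box permutation unit therefore stores 2048 bits of table, and an architecture that evaluates two rounds per clock needs two such units.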

For implementing the conventional DES algorithm, which
iterates the identical operation 16 times, one round
operation is referred to as a basic operation unit and the DES
architecture is implemented by using 16 basic operation units.
However, an unrolled loop architecture, which uses one of 2
through 16 round operations as a basic operation unit, has
been introduced and has attracted more attention. The unrolled
loop architecture efficiently reduces the time margin, or
slack, between the basic operation units by combining the
operations of the basic operation units, and reduces the size
of a chip by using a boundary optimizing combination. Since
the unrolled loop architecture computes two round operations
in one clock cycle, encryption can be performed within eight
clock cycles; however, two S-Box permutation units, which
take a relatively large area in a chip, are necessary.

Fig. 4 is a block diagram of a conventional DES
architecture using an unrolled loop cipher function.

Referring to Fig. 4, the conventional DES architecture
using an unrolled loop cipher function includes an initial
permutation unit 400, multiplexers 410 and 420, combination
logic units 430 and 440, registers 450 and 460, and a final
permutation unit 470.

The initial permutation unit 400 permutes data and key
blocks. The multiplexer 410 selects one of the data block
from the initial permutation unit 400 or a data block fed
back from the register 450. The multiplexer 420 selects one
of the key block from the initial permutation unit 400 or a
key block fed back from the register 460. The combination
logic unit 430 performs an odd round of the encryption
operation over the data block and the key block from the
multiplexers 410 and 420. The combination logic unit 440
performs an even round of the encryption operation over the
data block and the key block from the combination logic unit
430. The registers 450 and 460 store the data block and the
key block from the combination logic unit 440 respectively.
The final permutation unit 470 generates a cipher text block
from the data block from the register 450.



Figs. 5A and 5B are block diagrams of the combination
logic units of the unrolled loop cipher function unit.

Referring to Figs. 5A and 5B, the unrolled loop cipher
function unit includes the combination logic units of Fig. 4,
which are circuits implementing two round operations of the
DES algorithm. For a clock cycle, the key scheduler generates
two subkeys Km and Kn, and the unrolled loop cipher function
unit performs two round operations of the DES algorithm,
having two cipher function units fm and fn and two
exclusive-OR (XOR) operation units, by using the keys Km and
Kn. In other words, the unrolled loop cipher function unit
receives two subkeys Km and Kn and output data blocks from
registers A and B, and outputs operation results RC and RD to
the corresponding registers at the next clock.

Fig. 6 is a detailed block diagram of the conventional
unrolled loop cipher function unit.

Referring to Fig. 6, the unrolled loop cipher function
unit includes two cipher function units. The cipher function
unit includes an expansion permutation unit 610,
exclusive-OR (X-OR) units 620 and 650, an S-Box permutation
unit 630 and a P-Box permutation unit 640.

A 32-bit data block from the register RB is expanded to a
48-bit block by the expansion permutation unit 610. The
48-bit block is X-ORed with a subkey Km from the key scheduler
by the X-OR unit 620. The 48-bit data block is substituted
into a 32-bit data block by the S-Box permutation unit 630.
The 32-bit data block from the S-Box permutation unit 630 is
permutated by the P-Box permutation unit 640. The 32-bit data
block from the P-Box permutation unit 640 is X-ORed with a
32-bit data block from the register RA by the X-OR unit 650,
and the 32-bit data block from the X-OR unit 650 is stored in
the register RC. The unrolled loop cipher function unit
includes one more cipher function unit, which has the same
elements as mentioned above and outputs another 32-bit data
block to the register RD.

If the 32-bit data blocks R2i-3 and R2i-2 are stored in the
registers A and B, and two subkeys K2i-1 and K2i are provided
by the key scheduler, the 32-bit data blocks R2i-1 and R2i are
computed in one clock cycle by equations (3) and (4).

R_{2i-1} = R_{2i-3} \oplus f(R_{2i-2}, K_{2i-1}), \quad i = 1, 2, \ldots, 8 \qquad (3)

R_{2i} = R_{2i-2} \oplus f(R_{2i-1}, K_{2i}), \quad i = 1, 2, \ldots, 8 \qquad (4)
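
A small sketch can confirm that eight applications of equations (3) and (4) give the same result as sixteen applications of equations (1) and (2). The round function `toy_f`, the function names and the dummy subkeys are assumptions for illustration only; `toy_f` is not the DES cipher function.

```python
MASK32 = 0xFFFFFFFF

def toy_f(r, k):
    # Placeholder round function (NOT the DES f), used only to show the data flow.
    return ((r * 0x9E3779B1) ^ k ^ (r >> 7)) & MASK32

def single_rounds(l0, r0, keys):
    """Sixteen clocks, one round each: equations (1) and (2)."""
    left, right = l0, r0
    for k in keys:
        left, right = right, left ^ toy_f(right, k)
    return left, right                       # (R15, R16)

def unrolled_rounds(l0, r0, keys):
    """Eight clocks, two rounds each: equations (3) and (4)."""
    reg_a, reg_b = l0, r0                    # R_{2i-3}, R_{2i-2}
    for i in range(1, 9):
        k_m, k_n = keys[2 * i - 2], keys[2 * i - 1]   # K_{2i-1}, K_{2i}
        r_c = reg_a ^ toy_f(reg_b, k_m)      # (3) R_{2i-1}
        r_d = reg_b ^ toy_f(r_c, k_n)        # (4) R_{2i}
        reg_a, reg_b = r_c, r_d              # registers updated once per clock
    return reg_a, reg_b                      # (R15, R16)

keys = [(0x01010101 * (i + 3)) & MASK32 for i in range(16)]
a = single_rounds(0xDEADBEEF, 0x0BADF00D, keys)
b = unrolled_rounds(0xDEADBEEF, 0x0BADF00D, keys)
```

The second cipher function consumes the output of the first within the same clock, which is why both K2i-1 and K2i must be available together.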



Fig. 7 is a block diagram of a conventional key
scheduler having two key scheduling units.

Referring to Fig. 7, the key scheduler includes two key
scheduling units, each having a first permutation choice unit
700, two registers 710 and 720, shift units 730 and 740, and
a second permutation choice unit 750.

In a first key scheduling unit, the first permutation
choice unit 700 performs permutation of a 56-bit key data
block. Each of the registers (Cm) 710 and (Dm) 720 stores 28
bits, half of the 56-bit key data block, in response to a
clock (CLK). The shift units 730 and 740 respectively shift
the corresponding 28-bit key data blocks from the registers
by a predetermined number of bits, e.g., two, three, or four
bits. The second permutation choice unit 750 receives the two
28-bit key blocks from the registers (Cm) 710 and (Dm) 720
and generates a sub-key Km. A second key scheduling unit
includes the same elements and generates a sub-key Kn.

For eight rounds, the first and the second key
scheduling units respectively generate subkeys K2i-1 and K2i.
In other words, the first key scheduling unit shifts the key
block by a predetermined number of bits, e.g., one, two,
three or four bits, for eight clock cycles so that the total
numbers of accumulated shifted bits are 1, 4, 8, 12, 15, 19,
23 and 27. The second key scheduling unit shifts the key block
by two, three or four bits for eight clock cycles so that the
numbers of accumulated shifted bits are 2, 6, 10, 14, 17, 21,
25 and 28.
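
The two lists of accumulated shifts follow from the per-round amounts of the standard DES key schedule, which this text does not list. The short sketch below derives them; the variable names are hypothetical.

```python
from itertools import accumulate

# Standard DES per-round rotation amounts (from the DES standard).
DES_SHIFTS = [1, 1, 2, 2, 2, 2, 2, 2, 1, 2, 2, 2, 2, 2, 2, 1]

totals = list(accumulate(DES_SHIFTS))   # total shift behind K1 .. K16
ts_m = totals[0::2]                     # odd subkeys  K1, K3, ..., K15
ts_n = totals[1::2]                     # even subkeys K2, K4, ..., K16
```

Splitting the running totals into odd and even rounds yields the sequence for the first key scheduling unit and the sequence for the second.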

While the key scheduler of Fig. 2, generating one subkey per
clock cycle, includes two registers and two shifters, the key
scheduler generating two subkeys per clock cycle as mentioned
above needs four registers and four shifters, which take a
large area in a chip. Therefore, there is a problem in that
the encryption apparatus becomes large due to the registers
and the shifters.



Summary of the Invention



Therefore, it is an object of the present invention to
provide a key scheduler having a small size.

In accordance with an aspect of the present invention,
there is provided a key scheduler for an apparatus using the
DES encryption algorithm, comprising: a first permutation
choice unit for permuting a 56-bit block; a first register
for storing the left 28 bits among the 56-bit block from the
first permutation choice unit in accordance with a clock
signal; a second register for storing the right 28 bits among
the 56-bit block from the first permutation choice unit in
accordance with the clock signal; a first and a second shift
units for shifting the 28-bit blocks stored in the first and
the second registers to the left by a first predetermined
number of bits and outputting the shifted 28-bit blocks to
the first and the second registers respectively; a second
permutation choice unit for permuting the 28 bits stored in
the first and the second registers, thereby generating a
first subkey; a third and a fourth shift units, each for
shifting the 28 bits stored in the first and the second
registers to the left by a second predetermined number of
bits; and a third permutation choice unit for permuting the
28 bits stored in the third and the fourth shifters, thereby
generating a second subkey.



Brief Description of the Drawings



The above and other objects and features of the instant
invention will become apparent from the following description
of preferred embodiments taken in conjunction with the
accompanying drawings, in which:

Fig. 1 is a block diagram of a general DES architecture;

Fig. 2 is a block diagram of a key scheduler generating
a sub-key;

Fig. 3 is a block diagram illustrating a cipher function
and an S-Box permutation unit of a general DES architecture;

Fig. 4 is a block diagram of a DES architecture using an
unrolled loop cipher function unit;

Figs. 5A and 5B are block diagrams of combination logic
units of the unrolled loop cipher function unit;

Fig. 6 is a block diagram of the unrolled loop cipher
function unit;

Fig. 7 is a block diagram of a conventional key
scheduler having two key scheduling units;

Fig. 8 is a block diagram of a key scheduler in
accordance with one embodiment of the present invention;

Fig. 9 is a block diagram of a key scheduler in
accordance with another embodiment of the present invention;

Fig. 10 is a timing diagram for explaining operations of
the key scheduler of the present invention;

Fig. 11 is a block diagram of a DES architecture using a
macro pipeline and a time multiplexed cipher function unit in
accordance with the present invention;

Fig. 12 is a block diagram of a DES architecture using a
macro pipeline and an unrolled loop cipher function unit in
accordance with the present invention;

Fig. 13 is a timing diagram for explaining operations of
the DES architecture using a macro pipeline and an unrolled
loop cipher function unit; and

Fig. 14 is a timing diagram illustrating the effect of the
DES architecture using a macro pipeline in accordance with
the present invention.



Preferred Embodiment of the Invention

Hereinafter, preferred embodiments of the present
invention will be described in detail with reference to the
accompanying drawings.

Fig. 8 is a block diagram of a key scheduler having a
key scheduling unit in accordance with one embodiment of the
present invention.



Referring to Fig. 8, the key scheduler includes a first
permutation choice (PC1) unit 800, two registers 810 and 820,
four shift units 830, 840, 860 and 870, and two second
permutation choice (PC2) units 850 and 880.

The first permutation choice unit 800 performs
permutation of a 56-bit key data block. The registers (Cm,
Dm) 810 and 820 respectively store the left and the right 28
bits, half of the 56-bit key data block, in response to a
clock (CLK). The shift units 830 and 840 shift the
corresponding 28-bit key data from the registers 810 and 820
by a predetermined number of bits, e.g., two, three, or four
bits. The second permutation choice unit 850 receives the two
28-bit key blocks from the registers (Cm, Dm) 810 and 820 and
generates a sub-key Km. The shift units 860 and 870
respectively shift the corresponding 28-bit key data blocks
from the registers 810 and 820 by a predetermined number of
bits, e.g., one or two bit(s). The second permutation choice
unit 880 receives the two 28-bit key blocks from the shift
units 860 and 870 and generates a sub-key Kn.

For eight rounds, the key scheduler computes a subkey
K2i-1 in the i-th round by using a key scheduling unit. The
registers (Cm, Dm) 810 and 820 receive and store an initial
key from the first permutation choice unit 800 or the key
block shifted by the shift units 830 and 840 at the next
clock cycle. In each round, the shift units shift the key
block by a predetermined number of bits Sm, e.g., 3, 4, 4, 3,
4, 4, 4, 2 (1) bits.

As shown in Fig. 8, a relation between a total number
TSm of shifted bits for obtaining a subkey K2i-1 and a total
number TSn of shifted bits for obtaining a subkey K2i in the
i-th round is expressed as: TSn - TSm = Dm.

The number of bits shifted in the shift units 830 and
840 at each round is described in the tables of Fig. 8.

In the first round (P0), the difference value Dm between
TSn and TSm is 1. In the eighth round (P7), i.e., when storing
a new initial key, the difference value is 0, and when
plain text blocks are encrypted by using the same key
iteratively, the difference value Dm is 1. In the other
rounds (P1 to P6), the difference value Dm is 2. Using the
additional two shifters 860 and 870 and the second
permutation choice unit 880 implemented by wiring, the key
scheduler computes the subkeys K2i-1 and K2i in the i-th round
and outputs Km and Kn.
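
The arrangement of Fig. 8 can be modelled behaviourally for one 28-bit half. This is one reading of the text, not a description of the actual circuit: the shifts are assumed to be rotations, PC2 is omitted because it is pure wiring, the key-reload special case is ignored, and `S_M` is indexed here by the round being entered, so its first entry (1) is the shift applied when a new key is loaded. All names are hypothetical.

```python
MASK28 = (1 << 28) - 1

def rotl28(x, n):
    """Rotate a 28-bit value left by n bits (n taken modulo 28)."""
    n %= 28
    return ((x << n) | (x >> (28 - n))) & MASK28

TS_M = [1, 4, 8, 12, 15, 19, 23, 27]    # totals behind K1, K3, ..., K15
TS_N = [2, 6, 10, 14, 17, 21, 25, 28]   # totals behind K2, K4, ..., K16
S_M = [1, 3, 4, 4, 3, 4, 4, 4]           # register update on entering P0..P7
D_M = [n - m for m, n in zip(TS_M, TS_N)]   # TSn - TSm per round

def fig8_halves(c0):
    """Yield the (Km input, Kn input) half-blocks for rounds P0..P7."""
    reg = c0
    for s, d in zip(S_M, D_M):
        reg = rotl28(reg, s)            # single register, shifted once per clock
        yield reg, rotl28(reg, d)       # extra shifter derives the Kn input

c0 = 0x5A5A5A5                          # arbitrary 28-bit example value
pairs = list(fig8_halves(c0))
```

Only one register per half is clocked; the second subkey comes from a fixed extra shift of one or two bits on the register output.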



Fig. 9 is a block diagram of a key scheduler having a
key scheduling unit in accordance with another embodiment of
the present invention.

Referring to Fig. 9, the key scheduler includes a first
permutation choice (PC1) unit 900, two registers 910 and 920,
four shift units 930, 940, 960 and 970, and two second
permutation choice (PC2) units 950 and 980.

The first permutation choice unit 900 performs
permutation of a 56-bit key block. Each of the registers (Cn,
Dn) 910 and 920 stores 28 bits, half of the 56-bit key block,
in response to a clock (CLK). Each of the shift units 930 and
940 shifts the corresponding 28-bit key block from the
registers 910 and 920 by a predetermined number of bits,
e.g., two, three, or four bits. The second permutation choice
(PC2) unit 950 receives the two 28-bit key blocks from the
registers (Cn, Dn) 910 and 920 and generates a sub-key Kn.
Each of the shift units 960 and 970 shifts the corresponding
28-bit key block from the registers 910 and 920 by a
predetermined number of bits, e.g., one or two bit(s). The
second permutation choice (PC2) unit 980 receives the two
28-bit key blocks from the shifters 960 and 970 and generates
a subkey Km.



The key scheduler of Fig. 9 computes a subkey K2i in the
i-th round by using the second key scheduling unit of Fig. 7.
As shown in a table of Fig. 9, a relation between a total
number TSm of shifted bits for obtaining a subkey K2i-1 and a
total number TSn of shifted bits for obtaining a subkey K2i in
the i-th round is expressed as: TSm - TSn = Dn.

In the first round (P0) and the eighth round (P7), the
difference value Dn is -1. In the other rounds (P1 to P6), the
difference value Dn is -2. Using the additional two right
shifters 960 and 970 and the second permutation choice unit
980 implemented by wiring, the key scheduler computes the
subkeys K2i and K2i-1 in the i-th round and outputs subkeys Kn
and Km.
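
The Fig. 9 variant can be modelled the same way, with the register tracking TSn and a right shift recovering the block for Km. The sketch assumes circular shifts and omits PC2; the per-round amounts in `S_N` are derived here from the accumulated totals rather than quoted from the text, and all names are hypothetical.

```python
MASK28 = (1 << 28) - 1

def rotl28(x, n):
    """Rotate a 28-bit value left by n bits (n taken modulo 28)."""
    n %= 28
    return ((x << n) | (x >> (28 - n))) & MASK28

def rotr28(x, n):
    """Rotate right by n bits, i.e. left by 28 - n."""
    return rotl28(x, 28 - (n % 28))

TS_M = [1, 4, 8, 12, 15, 19, 23, 27]
TS_N = [2, 6, 10, 14, 17, 21, 25, 28]
S_N = [2, 4, 4, 4, 3, 4, 4, 3]           # register update on entering P0..P7
D_N = [m - n for m, n in zip(TS_M, TS_N)]   # TSm - TSn: -1 or -2

def fig9_schedule(d0):
    """Return the (Km input, Kn input) pairs and the final register value."""
    reg, out = d0, []
    for s, d in zip(S_N, D_N):
        reg = rotl28(reg, s)                 # register holds the Kn block
        out.append((rotr28(reg, -d), reg))   # right shifter derives the Km block
    return out, reg

d0 = 0x3C3C3C3                               # arbitrary 28-bit example value
pairs, final_reg = fig9_schedule(d0)
```

After the eighth round the accumulated shift is 28, so the register again holds the initial key block without any extra circuitry.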



Fig. 10 is a timing diagram illustrating operations of
the key scheduler.

Referring to Fig. 10, Km and Kn denote access times to
the subkeys in the 8-round DES architecture. TSm and TSn
denote the total numbers of shifted bits of the initial key
block after the first permutation choice unit (PC1). Sm and
Sn denote the numbers of shifted bits in each round (Pi) in
order to obtain the total numbers of shifted bits described
in TSm and TSn.

Processes for generating the subkeys will be described.

In a first round (P0), since TSm and TSn are 1 and 2, the
subkeys K1 and K2 are generated by shifting the initial key
block from the PC1 by one and two bits and permuting the
shifted blocks through the PC2.

In a second round (P1), since TSm and TSn are 4 and 6, in
order to generate the subkeys K3 and K4, each of the shift
units shifts the key block stored in the corresponding
register to the left by 3 (=4-1) and 4 (=6-2) bits.

In a third round (P2), since TSm and TSn are 8 and 10, in
order to generate the subkeys K5 and K6, each of the shift
units shifts the key block stored in the corresponding
register to the left by 4 (=8-4) and 4 (=10-6) bits.

In each round (Pi), the key blocks stored in the
corresponding registers are shifted to the left by Sm and Sn
bits, and the key blocks are shifted by TSm=27 and TSn=28 (=0)
in an eighth round (P7). Then, in order to return to the first
round, i.e., TSm=1 and TSn=2, Sm and Sn should be two (2)
respectively.
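
The per-round amounts Sm and Sn, including the wrap-around from the eighth round back to the first, follow from the accumulated totals by subtraction modulo the register width. A minimal sketch, with `step_shifts` as a hypothetical helper name:

```python
TS_M = [1, 4, 8, 12, 15, 19, 23, 27]    # totals behind K1, K3, ..., K15
TS_N = [2, 6, 10, 14, 17, 21, 25, 28]   # totals behind K2, K4, ..., K16

def step_shifts(totals, width=28):
    """Shift applied after each round P0..P7 to reach the next round's total.

    The last entry wraps around from P7 back to P0, which is the case
    where the same key is reused for the next plain text block.
    """
    count = len(totals)
    return [(totals[(i + 1) % count] - totals[i]) % width for i in range(count)]

s_m = step_shifts(TS_M)
s_n = step_shifts(TS_N)
```

The final entry of both lists is 2, matching the statement that Sm and Sn should both be two to return to the first round.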

Generally, in many cases there are many data blocks to be
encrypted with a given key. In such cases, the performance of
encryption can be increased by using a pipeline structure.
Pipelines used in the DES architecture are classified as a
micro pipeline or a macro pipeline in accordance with the
level at which they are applied.

A data input rate to the DES encryption unit is decided
based on the speed of the whole encryption system rather than
the speed of the DES encryption unit. In the case of a DES
architecture used for networking, the period of the macro
pipeline is decided in accordance with the maximum
transmission rate of a modulator and a demodulator, and the
speed of an external host microprocessor.

In general, the data input/output speed of the DES
encryption unit is slow. Since the data is moved byte-by-byte
(8 bits) outside of the DES encryption unit, and the DES
encryption unit performs encryption of the 64-bit data block
and outputs an encrypted 64-bit data block, an input register
and an output register are necessary. In order to reduce the
latency of input/output, the encryption apparatus of the
present invention uses a macro pipeline including an input
process (first stage), a DES operation process (second stage)
and an output process (third stage). The period of the macro
pipeline is determined by the maximum value among the times
for input, output and DES operation of the data. When the
times for the input, the output and the DES operation of the
data are identical, the macro pipeline structure has a
maximum effect on reduction of the latency.
Fig. 11 is a block diagram of a DES architecture using a
macro pipeline and a time multiplexed cipher function unit to
which the present invention is applied.

Referring to Fig. 11, the macro pipeline includes three
stages. In a first stage, the input data stream is divided
into eight 8-bit blocks, and every four 8-bit blocks are
sequentially inputted, gathered and stored into a left input
buffer register (IBR(L)) 1110 and a right input buffer
register (IBR(R)) 1120. In a second stage, each 32-bit data
block from the left and the right input buffer registers is
alternately inputted to a first and a second cipher function
units and encrypted for 8 rounds. In a third stage, each
32-bit data block is divided into four 8-bit blocks and
outputted 8-bit block by 8-bit block through a left output
buffer register (OBR(L)) 1140 and a right output buffer
register (OBR(R)) 1150.
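
The first and third stages amount to packing bytes into two 32-bit registers and unpacking them again. The sketch below assumes big-endian byte order and leaves out the initial and inverse initial permutations; `gather_block` and `emit_block` are hypothetical names.

```python
def gather_block(byte_stream):
    """Collect eight bytes into the left and right 32-bit input buffers."""
    data = bytes(byte_stream)
    if len(data) != 8:
        raise ValueError("a DES block is eight 8-bit blocks")
    ibr_l = int.from_bytes(data[:4], "big")   # first four bytes -> IBR(L)
    ibr_r = int.from_bytes(data[4:], "big")   # last four bytes  -> IBR(R)
    return ibr_l, ibr_r

def emit_block(obr_l, obr_r):
    """Split the two 32-bit output buffers back into eight bytes."""
    return obr_l.to_bytes(4, "big") + obr_r.to_bytes(4, "big")

block = b"\x01\x23\x45\x67\x89\xab\xcd\xef"
left, right = gather_block(block)
round_trip = emit_block(left, right)
```

With a byte-wide external bus, each of these two stages takes eight clock cycles per 64-bit block, which is the figure used in the latency comparison.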



The time multiplexed cipher function unit receives the
32-bit blocks from the registers A0 and B0 and subkeys KA and
KB from the key scheduler. For the front half of the first
clock, the cipher function unit fA is operated; for the latter
half of the first clock, the cipher function unit fB is
operated. In other words, the time multiplexed cipher function
unit receives the 32-bit block from the register A0 and the
subkey KA, performs the cipher function over the register A0
and the subkey KA through the expansion permutation unit, the
XOR unit, the S-Box permutation unit and the P-Box
permutation unit, and outputs the 32-bit block, which is the
cipher function operation result. Similarly, the time
multiplexed cipher function unit receives the 32-bit block
from the register B0 and the subkey KB, and outputs the cipher
function operation result.

The key scheduler of the present invention, generating
two subkeys K2i-1 and K2i in one clock cycle, can be used for
the 8-round DES architecture using the time multiplexed
cipher function unit.

Fig. 12 is a block diagram of a DES architecture using a
macro pipeline and an unrolled loop cipher function unit to
which the present invention is applied.

Referring to Fig. 12, the macro pipeline includes three
stages. In a first stage, the input data stream is divided
into eight 8-bit blocks, and every four 8-bit blocks are
sequentially inputted, gathered and stored into a left input
buffer register (IBR(L)) 1210 and a right input buffer
register (IBR(R)) 1220. In a second stage, each 32-bit data
block from the left and the right input buffer registers is
encrypted by the unrolled loop cipher function unit for 8
rounds. In a third stage, each 32-bit data block is divided
into four 8-bit blocks and outputted 8-bit block by 8-bit
block through a left output buffer register (OBR(L)) 1260 and
a right output buffer register (OBR(R)) 1270.



The key scheduler of the present invention, generating
two subkeys K2i-1 and K2i (i = 1, 2, ..., 8) in one clock
cycle, can be used for the 8-round DES architecture using the
time multiplexed cipher function unit and the unrolled loop
cipher function unit.

The conventional key scheduler having two key scheduling
units includes four registers storing shifted results of the
initial key block, which take a large area in a chip.

In the i-th round, the total numbers of shifted bits of the
initial key block for obtaining the subkeys K2i-1 and K2i
differ by one or two bit(s). The key schedulers of Figs. 8
and 9 can be implemented by using one of the key scheduling
units of Fig. 7. The key schedulers of Figs. 8 and 9 compute
the other subkey of the pair K2i and K2i-1 from the key block
used for obtaining the subkeys K2i-1 and K2i by using
additional two shifters and a permutation choice unit (PC2).
Since the total number of the shifted bits does not become 0
(=28) in the case of the key scheduler of Fig. 8 using the
first key scheduling unit, an additional device is necessary
when the initial key is stored. Therefore, the size of the key
scheduler of Fig. 8 is larger than that of the key scheduler
of Fig. 9 using only the second key scheduling unit. Since
the unrolled loop cipher function starts the cipher function
operation by using the subkey K2i-1, and the key scheduler of
Fig. 9 obtains the subkey K2i-1 by using the right shifter,
its number of critical paths is larger than that of the key
scheduler of Fig. 8.

Fig. 13 is a timing diagram for explaining operations of
the DES architecture using a macro pipeline and an unrolled
loop cipher function unit.

Referring to Fig. 13, the DES architecture receives the
initial permuted plain text blocks (y0, z0), (a0, b0), (c0,
d0) in order, computes zi, bi, di (i = 1, 2, ..., 16) and
outputs (z16, z15), (b16, b15), (d16, d15).

For ease of description, the process of computing bi from
(a0, b0) and outputting (b16, b15) will be described. A 64-bit
plain text block after initial permutation is divided into
two 32-bit blocks, a0 and b0. In other words, a0 = L0 = R-1,
and b0 = R0. The DES encryption unit computes values b1, b2,
..., b16 (bi = Ri). Before computing bi, a subkey Ki is
provided to a cipher function unit from a key scheduler.

For eight cycles before t0, data which is inputted
byte-by-byte is gathered in the input buffer register (IBR).
The left buffer register (IBR(L)) holds a0 and the right
buffer register (IBR(R)) holds b0 at [t0-t2]. At the next
clock, each of the input buffer registers gathers one byte of
the next plain text block c0 and d0. After eight clock cycles,
the input buffer registers hold c0 and d0 at [t16-t18].

The output buffer registers (OBR) load z16 and z15 from A0
and B0 at t0, and output inverse permuted data from t1
byte-by-byte for 8 clock cycles. The data blocks z16 and z15
remain in the OBR; b16 and b15 from the registers A and B are
loaded at t16 and remain for eight clock cycles, and the
inverse-initial-permuted data is outputted byte-by-byte from
t17.



a0 and b0 registered in the input buffer register (IBR)
are accessed at [t0-t2], the unrolled loop cipher function is
computed at [t0-t2] by using the subkeys K1 and K2 from the
key scheduler, and b1 and b2 can be stored in the registers A
and B at t2.

Since b1 and b2 registered in the registers A and B can be
accessed at [t2-t4], the unrolled loop cipher function is
computed at [t2-t4] by using the subkeys K3 and K4 from the
key scheduler, and b3 and b4 can be stored in the registers A
and B at t4.

Computation of b1 and b2 is started at t0, and then each
of b2, b3, ..., b15 is computed and stored at the
corresponding register. After eight clock cycles, b15 and b16
are stored in the registers A and B at t16, thereby
terminating the DES operation of a0 and b0. Simultaneously,
the DES operation of c0 and d0 is performed at t16.

Fig. 14 is a timing diagram illustrating the effect of the
DES architecture using a macro pipeline in accordance with
the present invention.

Referring to Fig. 14, it shows a comparison of the
performances of the 8-round pipeline DES architecture and the
16-round DES architecture. Latency means the number of clock
cycles which are necessary from input of one plain text block
to output of one cipher text block through the DES encryption
operation. Throughput means the number of plain text blocks
encrypted per clock cycle.

In the case of the conventional 16-round DES architecture
using no macro pipeline, since the input process and the
output process take 8 clock cycles respectively and the DES
encryption process takes 16 clock cycles, a new plain text
block can be inputted every 32 clock cycles. In this case,
the latency is 32 and the throughput is 1/32.

If a 2-stage macro pipeline is introduced in the input and
the output processes, the latency is 32, which is the same as
that of the case mentioned above; however, the input process
and the output process of the encrypted data are
simultaneously performed. Therefore, a new plain text block
can be inputted every 24 clock cycles, and the throughput
is 1/24.

If a 3-stage macro pipeline is introduced in the input
process, the DES encryption process and the output process,
since eight clock cycles are idle in the input and the output
processes respectively, the latency is 40, which is larger
than that of the case mentioned above; however, the
throughput is 1/16. In other words, a new plain text block
can be inputted and encrypted every 16 clock cycles. The
DES architectures using the unrolled loop cipher function
unit and the time multiplexed cipher function unit each
perform the input process, the DES encryption process and the
output process in 8 clock cycles. If a 3-stage macro pipeline
is introduced to the DES architectures using the unrolled
loop cipher function unit and the time multiplexed cipher
function unit, the latency is 24, the throughput is 1/8, and
a new plain text block can be inputted and encrypted every 8
clock cycles.
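
The figures quoted for the four cases can be reproduced with a small timing model. The period formulas follow the description; the latency formula for the 3-stage case (one full pipeline period for input, then the DES time, then the output time) is an assumption chosen to be consistent with the quoted values of 40 and 24. Function names are hypothetical.

```python
from fractions import Fraction

def no_pipeline(t_in, t_des, t_out):
    """Stages run strictly one after another; returns (latency, period)."""
    total = t_in + t_des + t_out
    return total, total

def two_stage(t_in, t_des, t_out):
    """Input of the next block overlaps output of the previous one."""
    latency = t_in + t_des + t_out
    period = max(t_in, t_out) + t_des
    return latency, period

def three_stage(t_in, t_des, t_out):
    """Every stage is stretched to the slowest one (assumed latency model)."""
    period = max(t_in, t_des, t_out)
    latency = period + t_des + t_out
    return latency, period

def throughput(period):
    """Plain text blocks encrypted per clock cycle."""
    return Fraction(1, period)

conventional = no_pipeline(8, 16, 8)     # 16-round DES, no macro pipeline
overlapped_io = two_stage(8, 16, 8)      # 16-round DES, 2-stage pipeline
pipelined_16 = three_stage(8, 16, 8)     # 16-round DES, 3-stage pipeline
pipelined_8 = three_stage(8, 8, 8)       # 8-round DES, 3-stage pipeline
```

When all three stage times are equal, as in the 8-round case, no stage sits idle, which is the condition under which the macro pipeline is said to be most effective.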

Using the key scheduler having one key scheduling unit,
the encryption apparatus has a small size, thereby reducing
the cost of the encryption apparatus.

Although the preferred embodiments of the invention have
been disclosed for illustrative purposes, those skilled in
the art will appreciate that various modifications, additions
and substitutions are possible, without departing from the
scope and spirit of the invention as disclosed in the
accompanying claims.